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(54) Events management system 



(57) An enhanced system management mode (SMM) includes nesting of SMI (system management
interrupt) routines for handling SMI events. Enhanced SMM is implemented in a computer system
to support a Virtual System Architecture (VSA) in which peripheral hardware, such as for
graphics and/or audio functions, is virtualized (simulated by SMI routines). Reentrant VSA/SMM
software (handler) includes VSA/SMI routines invoked either by (a) SMI interrupts, such as from
non-virtualized peripheral hardware such as audio FIFO buffers, or (b) SMI traps, such as from
accesses to memory mapped or I/O space allocated to a virtualized peripheral function. SMI
nesting permits a currently active VSA/SMI routine to be preempted by another (higher priority)
SMI event. The SMM memory region includes an SMI header segment and a VSA/SMM software
segment -- the SMI header segment is organized as a quasi-stack into which nested SMI headers
are saved. The VSA/SMM software manages an SMHR register that points to the location for
storing the SMI header for a currently active VSA/SMI routine if it is preempted by an SMI
event. To improve performance, the entire SMM region is mapped into cacheable system memory.
Features that support virtualization include: (a) SMI nesting, (b) SMI trapping for memory (as
well as I/O) accesses, (c) caching both VSA/SMI headers and VSA/SMM software, and (d)
configuring the SMM region for storing multiple SMI headers at programmable locations.
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This invention relates to computer systems
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registers (such as for video timing, cursor control, color palette, and graphics modes). 

Another problem with current SMM implementations is that they are typically too slow to support hardware
virtualization. SMI handling requires saving processor state, invoking the SMI handling routine, processing the SMI
routine, and restoring the processor state prior to resuming normal processing. In current SMM implementations, this
process takes around 6-10 microseconds, while peripheral interface functions implemented by peripheral interface
hardware typically take around 1-2 microseconds.

The related application (3) describes an enhanced SMM implementation that expedites SMM operations by saving
only that portion of the processor state that will necessarily be modified by every SMM handling operation -- if a particular
SMI handling routine will modify other portions of the processor state, then the SMI handling routine saves those other
portions of the processor state (special SMM instructions are provided for that purpose). This technique minimizes the
processor state information that must be saved and restored, thereby reducing the overhead/latency associated with
entry to and exit from SMM mode.

Another technique that has been used to improve SMM performance is to make a portion of SMM space cacheable.
In particular, in one SMM implementation, an SMM handler uses SMM space (which is noncacheable) to store SMM
header information (including processor state), and then jumps to another region of memory that is cacheable. Thus,
while the SMM overhead is not reduced (i.e., the processor state information that must be restored is not cached),
performance of the SMM handler is improved by caching.

Respective aspects of the invention are set forth in the respective independent claims hereof.
According to a further aspect of the invention there is provided an enhanced system management mode (SMM)
including SMI nesting. The enhanced SMM is implemented in a computer system that includes a processor and system
memory, wherein the processor supports a system management mode of processing including a system management
interrupt (SMI) mechanism that signals SMI events.

In a preferred form of implementation of the aspect of the invention just set forth, the enhanced SMM
includes a reentrant SMM software handler having, for each of a plurality of SMI events, a corresponding SMI routine.
An SMM region is defined in the system memory, and includes an SMI context segment and a segment for the SMM
handler.

SMM logic recognizes SMI interrupts and selectively invokes the SMM handler to process corresponding SMI
routines.

For a first SMI event, the SMM logic stores first selected processor state information into the SMI context segment
and invokes the SMM handler to process a corresponding first SMI routine. For a second SMI event that occurs during
processing of the first SMI routine, the SMM logic stores second selected processor state information into the SMI
context segment while continuing to maintain the first selected processor state information, and reenters the SMM
handler to process a corresponding second SMI routine.

When the processor completes processing the second SMI routine, the SMM logic restores the second selected
processor state information and then resumes the preempted first SMI routine.

The SMI context segment may be implemented as a quasi-stack in which each SMI event is allocated a corresponding
location for storing corresponding selected processor state information. Thus, the first selected processor state
information may be stored in a first location, and the second selected processor state information may be stored in a second
location with the first selected processor state information being maintained in the first location.

The SMI logic may include a register that stores the address pointer for the next location for storing selected
processor state information in response to a next SMI event.
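
The quasi-stack and its pointer register can be illustrated with a small model. The following is a minimal sketch only, not the described hardware: the slot size, the starting address, the upward growth direction, and the names `SmmRegion`, `enter_smi`, and `exit_smi` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SmmRegion:
    """Toy model of an SMI context segment used as a quasi-stack."""
    smhr: int = 0x1000          # hypothetical address of the next header slot
    header_size: int = 0x30     # hypothetical size of one saved header
    slots: dict = field(default_factory=dict)  # address -> saved state

    def enter_smi(self, state):
        """Save selected processor state at the location the pointer names."""
        addr = self.smhr
        self.slots[addr] = state
        # Advance the pointer so that a nested SMI is stored in the next
        # slot instead of overwriting the state saved for this one.
        self.smhr = addr + self.header_size
        return addr

    def exit_smi(self):
        """Restore the most recently saved state and release its slot."""
        self.smhr -= self.header_size
        return self.slots.pop(self.smhr)


region = SmmRegion()
first = region.enter_smi({"event": "audio FIFO", "eip": 0x4000})
second = region.enter_smi({"event": "graphics trap", "eip": 0x5000})  # nested
```

In this model the first saved state stays in its slot while the second event is serviced, and states are restored in the reverse order of entry.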

The SMI handler may be cacheable. For an exemplary embodiment described hereinbelow, both the SMI header
and the SMI handler are cacheable.

An SMI event may be generated either internal or external to the processor. 
An SMI event may be generated in response to an access to a memory mapped region of memory. 
Embodiments of the invention may be implemented to realize one or more of the following technical advantages.
The enhanced System Management Mode implements SMI nesting, such as to support virtualization of peripheral
interface functions. SMM mode, including SMI nesting, may be invoked by an SMI signaled in response to a memory
mapped access, such as to support virtualization of graphics functions. An SMM region, including SMI header/context
information and reentrant VSA software, may be mapped into cacheable system memory to increase performance
(i.e. increase throughput and decrease overhead/latency). SMI header location/pointer and the top and bottom of the
SMM memory region may be precomputed by microcode to reduce latency associated with saving the SMI header,
and thereby speed entry to the VSA software for servicing an SMI event. An SMHR register may be used to provide
an interface between microcode and the VSA software -- when nesting is enabled, the VSA software updates the SMHR
register to provide the address of the SMI header location to be used for the next SMI event, and this address is then
used by the microcode in saving the SMI header in response to a nested SMI event.

A preferred embodiment described hereinbelow provides a computer system in which a processor implements an 
enhanced system management mode with nesting of SMI events. 
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definition, etc.), and (b) the basic design and operation of such processors and associated computer systems. 

When used with a signal the # symbol designates a signal that is active low, while the / symbol designates the 
complement of a signal. 

The term "virtualize" means simulating properties or functions of a hardware device or subsystem that would result
during normal processing of an application program so as to obviate such hardware device or subsystem.

1. Computer System 

Fig. 1 depicts an exemplary, but not exclusive, system practiced in accordance with the principles of the
present invention. A system circuit board 11 (a.k.a. motherboard) preferably includes buses to couple together a CPU
10, system memory 36, a RAMDAC/thin film transistor display panel interface 40, L2 cache 44, and chipset logic
circuitry 49. A multi-tasking operating system program such as Microsoft® Windows™ preferably executes on the CPU
10 to manage primary operations.

The CPU 10 preferably includes the following functional units: an internal bus controller 12, a CPU core 14, a
(level-one) L1 cache 18 - part of which is partitionable as a scratchpad memory, a memory controller 28, a floating point
unit (FPU) 16, a display controller 20, an internal SMI generator 21, a graphics pipeline (a.k.a. graphics accelerator) 22,
a (level-two) L2 cache controller 24, and a PCI-bus controller 26.

The bus controller 12, the CPU core 14, the FPU 16, the L1 cache 18, and the graphics pipeline 22 are coupled
together through an internal (with respect to the CPU 10) C-bus 30 whose exact configuration is not necessary for the
understanding of the present invention. The bus controller 12, display controller 20, the graphics pipeline 22, the L2
cache controller 24, the PCI-bus controller 26, and the memory controller 28 are coupled together through an internal
(with respect to the CPU 10) X-bus 32.

The details of the C-bus 30 and X-bus 32 are not necessary for the understanding of the present description. It is
sufficient to understand that independent C and X buses 30 and 32 decouple these functional units within the CPU 10
so that, for example, the CPU core 14, the FPU 16, and L1 cache 18 can operate substantially autonomously from the
remainder of the CPU 10 and so that other activities (e.g. PCI-bus transfers, L2 cache transfers, and graphics updates)
can be conducted independently. More specifically, the C-bus 30 has sufficient bandwidth to allow the graphics pipeline
22 to access the scratchpad memory while the CPU core 14 is performing an unrelated operation.

The CPU core 14 in the preferred embodiment is a six stage execution pipeline. The exemplary execution pipeline
includes the following stages:

IF   Instruction Fetch -- a plurality of bytes are fetched into a buffer,
ID   Instruction Decode -- decode and scoreboard checks,
AC1  Address Calculation -- linear address calculations for memory references,
AC2  Operand Access -- physical address translation, as well as cache and register file access,
EX   Execution -- instruction execution, and
WB   Writeback -- execution results written to register file and write buffers.

Those skilled in the art, with the aid of the present disclosure, will recognize other numbers of stages for the pipeline
and other configurations for the CPU core 14 without departing from the scope of the present invention.

The L1 cache 18 is preferably, although not exclusively, a 16K byte unified data/instruction cache that operates in
either a write-through or write-back mode. An area of the L1 cache 18 can be programmably partitioned as the
scratchpad memory through configuration control registers (not shown) in the CPU core 14. Scratchpad control circuitry in the
L1 cache 18 includes data pointers which can be used by either the CPU core 14 or the graphics pipeline 22 to access
data in the scratchpad memory. The scratchpad memory may also be addressed directly by the CPU core 14.

An exemplary, but not exclusive, use for the scratchpad memory is as a blit buffer for use by the graphics pipeline
22. More specifically, whenever data is moved on the display 42, a raster line (scanline), or portion thereof, of data is
read from the direct-mapped frame buffer 35 (preferably in system memory 36), written to the blit buffer partitioned out
of the L1 cache 18, and then read back out and written to another region of the direct-mapped frame buffer 35. Programs
executed by the CPU core 14 can also directly put data into the blit buffer and have the graphics pipeline 22
autonomously read it out and put it in the direct-mapped frame buffer 35.
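
The two-step move through the blit buffer can be sketched as follows. This is a minimal illustration under stated assumptions: the frame buffer and blit buffer are modeled as plain byte arrays, and the function name `blit_scanline` and all sizes are hypothetical rather than taken from the described hardware.

```python
def blit_scanline(frame_buffer, src, dst, length, blit_buffer):
    """Move `length` bytes within the frame buffer via a scratchpad blit buffer."""
    if length > len(blit_buffer):
        raise ValueError("scanline portion does not fit in the blit buffer")
    # Step 1: read the source span from the frame buffer into the blit buffer.
    blit_buffer[:length] = frame_buffer[src:src + length]
    # Step 2: read it back out and write it to the destination region.
    frame_buffer[dst:dst + length] = blit_buffer[:length]


frame_buffer = bytearray(range(16)) + bytearray(16)  # 32-byte toy frame buffer
blit_buffer = bytearray(8)                           # toy scratchpad partition
blit_scanline(frame_buffer, src=4, dst=20, length=8, blit_buffer=blit_buffer)
```

Staging the data in a small on-chip buffer is what lets the source read and the destination write proceed as two separate transfers.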

The preferred L1 cache 18, along with other exemplary applications for the scratchpad memory, are described in
co-pending US patent application Serial No: 08/464,921, filed June 05, 1995, entitled "Partitionable Cache", assigned
to the present applicants and herein incorporated by reference. It is to be understood, however, that the L1 cache 18
may be larger or smaller in size or may have a Harvard "split" architecture without departing from the scope of the
present invention. It is also to be understood that the scratchpad memory may be a memory separate from the L1 cache
18 without departing from the scope of the present invention.

The graphics pipeline 22 is coupled to the memory controller 28 through a dedicated bus 34 that expedites block 
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externally-signaled SMIs, the chipset 49 includes SMI control registers that must be read by the processor to determine 
the source of an SMI.

2.1. VSA Graphics Virtualization


VSA graphics virtualization is described in the related application (1). In general, for the exemplary VSA
implementation, VSA graphics virtualization is invoked primarily by internally-signaled memory-mapped and I/O access
traps. Specifically, in an exemplary implementation, SMI events will be signaled for (a) memory writes to graphics
regions of the memory map, and (b) I/O read/write accesses to I/O mapped graphics control registers.

Referring to Figure 1, memory access trapping is performed by the CPU core 14 (during the address calculation
stage of the execution pipeline), while I/O read/write trapping is implemented in bus controller 12 -- because of timing
considerations, memory mapped accesses are trapped early in the execution pipeline and are not allowed to go out
on the external bus. In response to a memory-mapped or I/O trap, the CPU core or bus controller signals the SMI event
to SMI generator 21, which in turn signals an SMI interrupt to CPU core 14 -- in response to the SMI, the VSA software
is invoked, which will call the appropriate VSA graphics virtualization routine to service the SMI. The SMM header
information indicates that the SMI was an internally-signaled graphics access, and whether the SMI resulted from a
memory-mapped or I/O access (see Section 3.1.1).
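
The trap conditions (a) and (b) can be expressed as a simple classification. The sketch below is illustrative only: the memory range uses the 640K to 768K graphics region mentioned elsewhere in this description, while the I/O port range (the standard VGA register block 0x3C0-0x3DF) and the function name `classify_access` are assumptions, not the actual trap configuration.

```python
KB = 1024
VGA_MEM_START = 640 * KB   # graphics memory region, 640K to 768K
VGA_MEM_END = 768 * KB
# Hypothetical set of trapped I/O ports; the standard VGA register block
# 0x3C0-0x3DF is used here purely for illustration.
VGA_IO_PORTS = range(0x3C0, 0x3E0)


def classify_access(space, address, is_write):
    """Return a description of the SMI to signal, or None if no trap applies."""
    if space == "mem" and is_write and VGA_MEM_START <= address < VGA_MEM_END:
        # Memory writes are trapped in the CPU core during address calculation.
        return {"source": "cpu_core", "memory": True, "write": True}
    if space == "io" and address in VGA_IO_PORTS:
        # I/O reads and writes are both trapped, in the bus controller.
        return {"source": "bus_controller", "memory": False, "write": is_write}
    return None
```

For example, `classify_access("mem", 0xA0000, True)` reports a core-trapped memory write, while a read of the same address is not trapped in this model.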

In addition to the internally-signaled memory-mapped and I/O trap SMIs, the exemplary VSA graphics virtualization
can be invoked by externally-signaled SMIs (such as for video timing or postponing display update).






2.2. VSA Audio Virtualization 



VSA audio virtualization is described in detail in the related application (2). In general, for the exemplary VSA
implementation, VSA audio virtualization is invoked by externally-signaled I/O access traps and interrupts generated
by audio interface logic. Specifically, SMI events will be signaled for (a) I/O accesses to I/O mapped audio functions
or hardware (such as audio registers), and (b) hardware interface functions such as audio FIFO buffer management.

I/O accesses are trapped in chipset 49, which also includes the audio interface logic (such as the audio FIFO
buffers). For each SMI event signaled, the chipset stores in external SMI status registers the corresponding SMI
identifier code. When the processor takes the SMI and invokes the VSA software, the SMI status register is read to determine
the source of the external SMI.
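
Reading the status register and selecting a routine amounts to a table lookup. The following is a minimal sketch: the identifier codes, the routine names, and the `read_status_register` callback are hypothetical, since the actual chipset encoding is not given in this description.

```python
# Hypothetical identifier codes; the real chipset encoding is not given here.
SMI_AUDIO_IO_TRAP = 0x01
SMI_AUDIO_FIFO = 0x02


def handle_audio_io_trap():
    return "virtualize audio register access"


def handle_audio_fifo():
    return "service audio FIFO buffer"


ROUTINES = {
    SMI_AUDIO_IO_TRAP: handle_audio_io_trap,
    SMI_AUDIO_FIFO: handle_audio_fifo,
}


def service_external_smi(read_status_register):
    """Read the chipset status register and run the matching VSA/SMI routine."""
    code = read_status_register()
    routine = ROUTINES.get(code)
    if routine is None:
        return "unknown SMI source 0x%02X" % code
    return routine()
```

Passing the register read in as a callable keeps the sketch independent of any particular port or bus access method.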

3.0. Enhanced SMM

The Virtual System Architecture implements an enhanced System Management Mode that allows nested VSA/SMI
routines to support the virtualization of peripheral hardware functions. Other enhancements to conventional SMM
implementations improve the performance of virtualization by reducing SMI overhead/latency.

For the exemplary enhanced System Management Mode, an SMM region of memory is defined conventionally
using an SMAR register that holds the base and limit of the SMM region. The SMM region comprises (a) a
header/context stack segment for storing multiple nested SMI headers, and (b) a VSA software segment storing reentrant VSA
software including VSA/SMI routines such as the exemplary VSA graphics and VSA audio virtualization routines.
Specific SMM feature enhancements for the exemplary enhanced System Management Mode include:

SMM mode may be invoked for memory mapped accesses, as well as I/O mapped accesses and asynchronous
interrupts, providing maximum flexibility in virtualizing graphics, audio, and other peripheral hardware functions --
the SMI header stores the 32-bit address and 32-bit data for memory or I/O mapped accesses.

The VSA software includes multiple VSA/SMI routines, and is reentrant for multiple nested SMIs.

The SMM region includes a header/context segment with a quasi-stack arrangement for storing multiple nested
SMI headers.

A new SMHR register is defined to provide an address pointer into the header/context segment of the SMM region
-- for a currently executing VSA/SMI routine in which nesting has been enabled, SMHR stores the physical address
pointing to the location for storing, in response to an SMI, the SMI header for that routine.

The SMHR register is managed by the VSA software, and provides a hardware interface between the VSA software
and the processor microcode.

To reduce SMM entry overhead, when SMI nesting is enabled, the processor microcode reads SMHR and
precomputes and stores a 32-bit pointer address for the next SMI header location.

The SMI header indicates whether the SMI interrupt was generated internally for a graphics (VGA) access.

The SMI header indicates whether the SMI interrupt resulted from a trap, and whether the trap resulted from a
memory or I/O access.
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Reserved 
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The SMI header includes the following bit-fields used to define the SMI event that caused the SMI:

C  Code segment writable
I  0 for in/read, 1 for out/write
P  REP instruction
S  SMINT instruction
H  SMI during CPU halt
M  0 for I/O, 1 for memory
X  External SMI pin
V  Internal graphics access
N  SMI during SMM mode

In particular, the M and V fields are used to define an internally-signaled SMI that will be serviced by the VSA graphics
virtualization routine, while the X field indicates that the SMI was signaled externally.

The N field is an SMM_MODE bit that indicates whether the SMI is a nested SMI, i.e., whether it occurred while
the processor was in SMM mode. The processor uses the SMM_MODE bit on exit from servicing an SMI event to
determine whether to stay in SMM mode or resume normal processing.
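
How a handler might interpret these fields can be sketched with a flag enumeration. Note the assumptions: the bit positions assigned below are hypothetical (the header layout is not reproduced in this excerpt), and the names `SmiFlags` and `describe` are introduced only for illustration.

```python
from enum import IntFlag


class SmiFlags(IntFlag):
    """SMI header bit-fields; the bit positions below are hypothetical."""
    C = 1 << 0  # code segment writable
    I = 1 << 1  # 0 for in/read, 1 for out/write
    P = 1 << 2  # REP instruction
    S = 1 << 3  # SMINT instruction
    H = 1 << 4  # SMI during CPU halt
    M = 1 << 5  # 0 for I/O, 1 for memory
    X = 1 << 6  # external SMI pin
    V = 1 << 7  # internal graphics access
    N = 1 << 8  # SMI during SMM mode (nested)


def describe(flags):
    """Summarise the source of an SMI and what happens on exit."""
    flags = SmiFlags(flags)
    if flags & SmiFlags.X:
        source = "external"
    elif flags & SmiFlags.V:
        space = "memory" if flags & SmiFlags.M else "I/O"
        source = "internal graphics %s access" % space
    else:
        source = "other internal"
    # The N (SMM_MODE) bit decides where execution continues after the routine.
    on_exit = "stay in SMM" if flags & SmiFlags.N else "resume normal processing"
    return source, on_exit
```

A header with V and N set, for instance, describes a nested graphics I/O trap after which the processor remains in SMM mode to finish the preempted routine.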

3.1.2. SMHR Register 

The exemplary enhanced System Management Mode provides a new configuration control register, SMHR,
that is used to support SMI nesting. Specifically, SMHR specifies a 32-bit physical SMI header address, and is used
to define the SMI header location for the next SMI event.
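
Because the exemplary SMHR is configured as four separate 8-bit configuration registers, software has to assemble or split the 32-bit address byte by byte. The sketch below shows one way to do that; the byte ordering (least significant byte first) and the function names are assumptions, as the actual register ordering is not specified in this excerpt.

```python
def smhr_from_bytes(b0, b1, b2, b3):
    """Combine four 8-bit register values into one 32-bit address.

    b0 is assumed to be the least significant byte; the actual register
    ordering is not specified in this excerpt.
    """
    for b in (b0, b1, b2, b3):
        if not 0 <= b <= 0xFF:
            raise ValueError("each configuration register holds 8 bits")
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)


def smhr_to_bytes(address):
    """Split a 32-bit address into four 8-bit values, least significant first."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("SMHR holds a 32-bit physical address")
    return tuple((address >> shift) & 0xFF for shift in (0, 8, 16, 24))
```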



SMHR Register 
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Note that the exemplary SMHR register configuration is as four separate 8-bit configuration registers: Each time 
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3.3. SMM Caching 

The exemplary enhanced System Management Mode uses caching to improve performance, both by reducing
overhead/latency in entering and exiting the VSA software, and by increasing throughput in processing VSA/SMI
routines.

The conventional SMADS# (SMI address strobe) is not used -- instead, a region of cacheable system memory is
allocated for the SMM memory region. The cacheable SMM region includes SMI header CPU state and other
context-dependent information as well as the reentrant VSA software (Figure 3).

Figure 6a illustrates the exemplary memory mapping scheme for mapping the SMM region into cacheable system
memory. This memory mapping function is performed by the bus controller 12 and memory controller 28 (see also
Figure 1).

The exemplary 32-bit x86 processor (10 in Figure 1) has an address space 230 of 4 Gbytes. Current computer
systems provide 4-16 Mbytes of system memory (36 in Figure 1) -- the top of system memory (DRAM) is indicated at
231. The region of system memory 640K to 1M is typically reserved for peripheral and BIOS functions -- the 128K
region from 640K to 768K is typically reserved for graphics (see Section 3.1) memory (VGA), while the region from
768K to 1M is used for other peripherals and BIOS functions.

The address space above the top of system memory 231 (i.e., above-DRAM address space) is commonly used
for mapping peripheral functions. For example, a graphics peripheral will use a portion of this address space to
designate special graphics registers and/or functions.

The Virtual System Architecture uses a portion of the above-DRAM address space to establish a GX Memory map 
region 232. For the exemplary Virtual System Architecture, the GX Memory map region includes a video frame buffer 
and the SMM region. 

The memory controller 28 remaps the GX Memory map region down into physical memory. The exemplary
remapping approach is to remap the frame buffer to the top of system memory, and to remap the SMM region into the
640K to 768K region of system memory usually reserved for graphics functions (but which is not needed by the VSA
for such functions).

The bus controller 12 includes an Xmapper 235 that decodes addresses from the C-Bus (30 in Figure 1) to provide
various control signals. One such control signal -- GX_MEM -- is used to indicate addresses that are within GX Memory.
In response to the assertion of GX_MEM, memory controller 28 performs a 2-bit decode to determine whether the
address, which is within the GX Memory region 232, is within the frame buffer or the SMM region. If the address is
within the SMM region, the memory controller performs the remapping to the SMM region 233 in system memory.
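
The decode-and-remap sequence can be sketched numerically. This is an illustration under loudly stated assumptions: the GX Memory base and size, the position and encodings of the 2-bit selector, the frame buffer size, and the 16 Mbyte top of system memory are all hypothetical values; only the 640K to 768K destination of the SMM region and the placement of the frame buffer at the top of system memory come from the description.

```python
KB = 1024
MB = 1024 * KB

GX_BASE = 0x40000000        # hypothetical base of the GX Memory map region
GX_SIZE = 0x01000000        # hypothetical size of that region
SELECT_SHIFT = 22           # hypothetical position of the 2-bit selector
SEL_FRAME_BUFFER = 0b00     # hypothetical encodings of the 2-bit decode
SEL_SMM = 0b01

TOP_OF_DRAM = 16 * MB       # top of system memory (16 Mbytes assumed)
FRAME_BUFFER_SIZE = 1 * MB  # hypothetical frame buffer size
SMM_PHYS_BASE = 640 * KB    # SMM region lands in the 640K-768K range
SMM_SIZE = 128 * KB


def gx_mem(address):
    """Bus controller decode: is the address within the GX Memory region?"""
    return GX_BASE <= address < GX_BASE + GX_SIZE


def remap(address):
    """Memory controller: remap a GX Memory address into physical DRAM."""
    if not gx_mem(address):
        return address                      # ordinary access, no remapping
    offset = address - GX_BASE
    select = (offset >> SELECT_SHIFT) & 0b11
    inner = offset & ((1 << SELECT_SHIFT) - 1)
    if select == SEL_SMM and inner < SMM_SIZE:
        return SMM_PHYS_BASE + inner
    if select == SEL_FRAME_BUFFER and inner < FRAME_BUFFER_SIZE:
        # Frame buffer is assumed to occupy the last bytes below top of DRAM.
        return TOP_OF_DRAM - FRAME_BUFFER_SIZE + inner
    raise ValueError("address is in GX Memory but not in a mapped sub-region")
```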

Figure 6b illustrates the cache control operations associated with accesses to the SMM memory region. If the bus
controller receives (240) an address from the C-Bus, and a cache miss is signaled by the cache (18 in Figure 1), the
bus controller will decode the address (242) to determine whether a cache line fill cycle should be run.

If the bus controller decodes the address as within GX Memory (245), it asserts GX_MEM (246) to the memory
controller. Both the bus controller and the memory controller perform a 2-bit decode of the address to detect whether
the access is directed to cacheable SMM memory.

If the bus controller determines that the address is within the SMM region (252), it will assert an internal KEN#
(cache enable) signal indicating a cacheable line fill in response to the cache miss. At the same time, the memory
controller will decode the address as within the SMM region, and perform the remapping to an address within the SMM
region 233.
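
The bus controller's side of this flow reduces to a short decision. The sketch below is a simplified model, not the actual control logic: the function name `smm_line_fill` is hypothetical, the two address decoders are supplied by the caller as predicates, and accesses outside the SMM region are deliberately left unmodeled.

```python
def smm_line_fill(address, cache_miss, in_gx_memory, in_smm_region):
    """Decide whether a miss triggers a cacheable line fill from SMM memory.

    Returns (gx_mem, ken_n): gx_mem mirrors the GX_MEM signal, and ken_n is 0
    when KEN# (active low) is asserted. Accesses outside the SMM region are
    not modeled here, so ken_n is None for them.
    """
    if not cache_miss:
        return (False, None)       # served by the cache; no decode needed
    gx = bool(in_gx_memory(address))
    if gx and in_smm_region(address):
        return (True, 0)           # assert GX_MEM and KEN#: cacheable fill
    return (gx, None)
```

Modeling KEN# as 0 when asserted follows the convention that the # symbol designates an active-low signal.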
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4. Conclusion 



Although the foregoing Detailed Description has been directed to certain exemplary embodiments, various modi- 
fications of these embodiments, as well as alternative embodiments, will be suggested to those skilled in the art. 

For example, the description of the enhanced System Management Mode in connection with the Virtual System
Architecture in general, and virtualizing graphics and audio peripheral hardware functions in particular, is exemplary
only. Also, the specific implementation of the enhanced SMM, including the specific implementation for SMI nesting,
including such configuration and control features as the SMHR register, microcode precomputation of SMI header
location, and the use of the SMI_NEST and SMM_MODE control signals, is exemplary only. Also, the SMI nesting
aspect of the invention is applicable to SMI from any source or cause, internally or externally signaled traps (memory
or I/O accesses) or interrupts.

In addition, specific register structures, mappings, bit assignments, and other implementation details are set forth
solely for purposes of providing a detailed description of embodiments of the invention in connection with an exemplary
x86 processor and computer system.

Also, references to dividing data into bytes, words, double words (dwords), quad words (qwords), etc., when used 
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[...]ons between VSA software and the various VSA/SMI routines is exemplary.

The invention encompasses any modifications or alternative embodiments that fall within the scope of the Claims.
Claims 
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6. The method of Claim 5, further comprising the step of storing an address pointer for a next location for storing 
selected processor state information in response to a next SMI event. 

7. The computer system of any of Claims 1-3 or the method of any of Claims 4-6, wherein the computer system 
includes cache memory, and the SMI handler is cacheable. 

8. The computer system of any of Claims 1-3 or the method of any of Claims 4-6, wherein an SMI event can be
generated either internal or external to the processor.

9. The computer system of any of Claims 1-3 or the method of any of Claims 4-6, wherein an SMI event can be
generated by a memory mapped access.
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(57) Abstract 

A system management mode (SMM) (900) of operating a processor (120) includes only a basic set of hardwired hooks or mechanisms 
in the processor for supporting SMM. Most of SMM functionality, such as the processing actions performed when entering and exiting 
SMM, is "soft" and freely defined. A system management interrupt (SMI) pin (960) is connected to the processor so that a signal on the 
SMI pin causes the processor to enter SMM mode. SMM is completely transparent to all other processor operating software. SMM handler
code and data is stored in memory that is protected and hidden from normal software access. 
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